Abstract

Background: Drug metabolism and pharmacokinetic (DMPK) assessment has become an important part of the
early stages of drug discovery. Computer-based methods are gaining ground in this area
and are often used as initial filters to eliminate compounds likely to present unfavourable pharmacokinetic profiles
or unacceptable levels of toxicity from the list of potential drug candidates, hence cutting down the cost of
drug discovery.

Results: In the present study, we report an in silico assessment of the DMPK profile of our recently published
natural products database of 1,859 unique compounds derived from 224 species of medicinal plants from the 
Cameroonian forest. In this analysis, we have used 46 computed physico-chemical properties or molecular 
descriptors to predict the absorption, distribution, metabolism and elimination (ADME) of the compounds. This 
survey demonstrated that about 50% of the compounds within the Cameroonian medicinal plant and natural 
products (CamMedNP) database are compliant, having properties which fall within the range of ADME properties 
of >95% of currently known drugs, while >73% of the compounds have <2 violations. Moreover, about 72% of the 
compounds within the corresponding 'drug-like' subset showed compliance. 

Conclusions: In addition to the previously verified levels of 'drug-likeness' and the diversity and the wide range of 
measured biological activities, the compounds in the CamMedNP database show interesting DMPK profiles and, 
hence, could represent an important starting point for hit/lead discovery from medicinal plants in Africa. 
Background

Natural products (NPs) play an increasingly important role in drug discovery today [1-5], both serving as drugs
and as templates for the design of nature-inspired medicines [3,6]. In fact, it has been reported that a significant
proportion of drugs that undergo clinical trials are either naturally occurring or are derived from NPs [7]. What
characterises NPs is their richness in stereogenic centres and coverage of segments of chemical space which are
typically not occupied by a majority of synthetic molecules and drugs [8,9]. In addition, they generally contain more
oxygen atoms and fewer aromatic atoms on average, when compared with 'drug-like' molecules [8-11]. It is needless
to say that NPs sometimes fail the famous 'drug-likeness' test due to the often bulky nature of naturally occurring
metabolites [11].

It is also worth mentioning that designing drug-like molecules having interesting pharmacokinetic properties
is an important paradigm in drug discovery programs [12,13]. This entails the search for lead compounds which
can be easily orally absorbed, easily transported to their
that may produce adverse side effects. The ensemble of
the above properties is often referred to as absorption, dis- 
tribution, metabolism and elimination (ADME) properties, 
or better still ADMET or ADME/T or ADMETox (i.e. if 
toxicity criteria are also taken into consideration). 

Computer-based in silico approaches for the prediction 
of ADMET profiles of drug leads at early stages of drug dis- 
covery are increasingly gaining ground [14-16]. This could 
be explained by the relative cost advantage added to the 
time factor, when compared to standard experimental ap- 
proaches for ADMET profiling [17,18]. On these grounds, 
several theoretical methods for the determination of 
ADMET parameters have been developed and imple- 
mented in a number of currently available software for drug 
discovery protocols [19-22], even though the predictions 
are sometimes disappointing [23]. Such software often
makes use of quantitative structure-activity relationships
[22-24] or knowledge-based methods [25-27]. The goal has
been to considerably cut down on the currently very high 
cost of discovery of a drug [17]. A promising lead is often 
defined as a compound which combines potency with an 
admirable ADMET profile. As such, compounds with un- 
favourably predicted pharmacokinetic profiles are either 
completely dismissed from the list of potential drug candi- 
dates (even if they prove to be highly potent) or the drug 
metabolism and pharmacokinetics (DMPK) properties are 
'fine-tuned' in order to improve their chances of making it
to clinical trials [28]. This explains why the 'graveyard' of 
very highly potent compounds which do not make it to 
clinical trials keeps filling up, to the extent that the process 
of drug discovery often presents the challenge of either 
resorting to new leads or 'resurrecting' some buried leads
with the view of fine-tuning their ADMET profiles. 

In a recent paper, we have presented a database of 1,859 
compounds derived from the Cameroonian flora, Camer- 
oonian medicinal plant and natural products (CamMedNP), 
the compounds being predicted to be sufficiently orally 
available and diverse to be employed in lead discovery pro- 
grams [29]. Additional arguments in favour of the use of 
this database are the wide range of the previously observed 
biological activities of the compounds and the wide range 
of ailments being treated by traditional medicine with the 
help of the herbs from which the compounds have been 
derived [29,30]. 

Numerous drugs at a late stage of pharmaceutical devel- 
opment and many more lead compounds fail due to ad- 
verse pharmacokinetic properties [18]. It is, therefore, 
important to incorporate the prediction of the ADME 
properties into the lead compound selection, by means of 
molecular descriptors. A molecular descriptor is often de- 
fined as a structural or physico-chemical property of a mol- 
ecule or part of a molecule, for example the logarithm of 
the n-octanol/water partition coefficient (log P), molar
weight (MW) and total polar surface area. A number of 



relevant molecular properties (descriptors) are often used 
to help predict the pharmacokinetic behaviour of potential 
drug leads. In the present study, we have carried out an in 
silico assessment of the ADMET profile of the CamMedNP 
database by the use of computed molecular descriptors cur- 
rently implemented in a wide range of software tools as in- 
dicators of the pharmacokinetic properties of a large 
proportion of currently known drugs. 

Methods 

Data sources and generation of 3D structures 

The plant sources, geographical collection sites, chemical 
structures of pure compounds and their measured bio- 
logical activities were retrieved from literature sources and 
have been previously described [29]. The three-dimensional 
(3D) structures were generated using the builder module of 
MOE [31], and energy minimization was subsequently 
carried out using the MMFF94 [32] until a gradient of 
0.01 kcal/mol was reached. 

Initial treatment of chemical structures and calculation of 
ADMET-related descriptors 

The 1,859 low-energy 3D chemical structures in the 
CamMedNP library were saved in mol2 format and initially 
treated with LigPrep [33], distributed by Schrödinger, Inc.
(New York, USA). This implementation was carried out 
with the graphical user interface of the Maestro software 
package (New York, USA) [34], using the OPLS force field 
[35-37]. Protonation states at biologically relevant pH were
correctly assigned (group I metals in simple salts were dis- 
connected, strong acids were deprotonated and strong 
bases protonated, while topological duplicates and explicit 
hydrogens were added). All molecular modelling was car- 
ried out on a Linux workstation (San Francisco, USA) with 
a 3.5 GHz Intel Core2 Duo processor (Santa Clara, USA). 
A set of the ADMET-related properties (a total of 46 mo- 
lecular descriptors) were calculated using the QikProp pro- 
gram (New York, USA) [21] running in normal mode. 
QikProp generates physically relevant descriptors and uses 
them to perform ADMET predictions. An overall ADME- 
compliance score, drug-likeness parameter (indicated by 
#stars), was used to assess the pharmacokinetic profiles of 
the compounds within the CamMedNP library. The #stars 
parameter indicates the number of property descriptors 
computed by QikProp that fall outside the optimum
range of values for 95% of known drugs. The methods 
implemented were developed by Jorgensen et al. [38-40]. 
Among the calculated descriptors are the total solvent-accessible molecular surface, S_mol in Å² (probe radius
1.4 Å; range for 95% of drugs is 300 to 1,000 Å²); the hydrophobic portion of the solvent-accessible molecular
surface, S_mol,hfob in Å² (probe radius 1.4 Å; range for 95% of drugs is 0 to 750 Å²); the total volume of molecule
enclosed by solvent-accessible molecular surface, V_mol in Å³ (probe radius 1.4 Å; range for 95% of drugs is 500
to 2,000 Å³); the logarithm of aqueous solubility, log S_wat (range for 95% of drugs is -6.0 to 0.5) [36,38]; the
logarithm of predicted binding constant to human serum albumin, log K_HSA (range for 95% of drugs is -1.5 to 1.2)
[41]; the logarithm of predicted blood/brain barrier partition coefficient, log B/B (range for 95% of drugs is -3.0
to 1.0) [42-44]; the predicted apparent Caco-2 cell membrane permeability (BIP_Caco-2) in Boehringer-Ingelheim
scale, in nm/s (range for 95% of drugs is <5 low, >100 high) [45-47]; the predicted apparent Madin-Darby canine
kidney (MDCK) cell permeability in nm/s (<25 poor, >500 great) [46]; the index of cohesion interaction in solids,
Ind_coh, calculated from the number of hydrogen bond acceptors (HBA), hydrogen bond donors (HBD) and the surface
area accessible to the solvent (S_mol) by the relation Ind_coh = HBA × √HBD / S_mol (0.0 to 0.05 for 95% of
drugs) [40]; the globularity descriptor, Glob = (4πr²)/S_mol, where r is the radius of the sphere whose volume is
equal to the molecular volume (0.75 to 0.95 for 95% of drugs); the predicted polarizability, QP_polrz (13.0 to 70.0
for 95% of drugs); the predicted IC50 value for blockage of HERG K+ channels, logHERG (concern <-5) [48,49]; the
predicted skin permeability, log K_p (-8.0 to -1.0 for 95% of drugs) [50,51]; and the number of likely metabolic
reactions, #metab (range for 95% of drugs is 0 to 15).
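As an illustration of how such ranges can be turned into a violation count in the spirit of #stars, the following is a minimal Python sketch. It is not QikProp's implementation: QikProp bases #stars on 24 descriptors with its own internal ranges, whereas this sketch uses only the ranges quoted above, and the dictionary keys and function names (`RANGES`, `globularity`, `cohesion_index`, `count_violations`) are hypothetical. The two helper functions implement the Ind_coh and Glob relations as stated in the text.

```python
import math

# 95%-of-drugs ranges quoted in the Methods text (lower, upper).
# Key names are illustrative only; they are not QikProp output column names.
RANGES = {
    "S_mol": (300.0, 1000.0),
    "S_mol_hfob": (0.0, 750.0),
    "V_mol": (500.0, 2000.0),
    "logS_wat": (-6.0, 0.5),
    "logK_HSA": (-1.5, 1.2),
    "logBB": (-3.0, 1.0),
    "Ind_coh": (0.0, 0.05),
    "Glob": (0.75, 0.95),
    "QP_polrz": (13.0, 70.0),
    "logK_p": (-8.0, -1.0),
    "n_metab": (0.0, 15.0),
}


def globularity(v_mol, s_mol):
    """Glob = 4*pi*r^2 / S_mol, with r the radius of a sphere of volume V_mol."""
    r = (3.0 * v_mol / (4.0 * math.pi)) ** (1.0 / 3.0)
    return 4.0 * math.pi * r ** 2 / s_mol


def cohesion_index(hba, hbd, s_mol):
    """Ind_coh = HBA * sqrt(HBD) / S_mol."""
    return hba * math.sqrt(hbd) / s_mol


def count_violations(descriptors):
    """Count descriptors that fall outside their quoted 95% range."""
    violations = 0
    for name, value in descriptors.items():
        if name not in RANGES:
            continue  # descriptor without a quoted range is ignored
        low, high = RANGES[name]
        if value < low or value > high:
            violations += 1
    return violations
```

Applied to the mean values reported for the total library in Table 1 (V_mol = 1,304.41 Å³, S_mol = 696.28 Å², HBA = 5.85, HBD = 2.39), the helpers give a globularity of about 0.83 and a cohesion index of about 0.013, close to the tabulated means of 0.84 and 0.013.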



Results and discussion 

Overall DMPK compliance of the CamMedNP library 

The 24 most relevant molecular descriptors calculated by
QikProp are used to determine the #stars parameter [52].
A plot of the #stars parameter (on the x-axis) against the
corresponding counts (on the y-axis) in the CamMedNP library
is shown on the same set of axes as those of the
'drug-like', 'lead-like' and 'fragment-like' standard subsets
(Figure 1). The criteria for the respective standard subsets
were defined as MW < 500, log P < 5, HBD < 5, HBA < 
10 [14]; 150 < MW < 350, log P < 4, HBD < 3, HBA < 6 
[53-55] and MW < 250, -2 < log P < 3, HBD < 3, HBA < 
6, NRB < 3 [56]. QikProp was unable to compute the 
ADMET descriptors for 25 compounds out of the total li- 
brary due to limitations that were not clear to us. Of the 
remaining 1,834 compounds, 48.04% showed #stars = 0,
while 74.21% had #stars < 2. Among the 1,122 compounds
of the drug-like subset, 71.93% (807 compounds; Table 1) had
pharmacokinetic descriptors within the acceptable range for 95% of known
drugs, while 97.33% showed #stars < 2. The lead-like and
fragment-like subsets were, respectively, 81.15% and
55.56% compliant for all of the 24 most relevant computed
descriptors. The mean values for 19 selected computed
descriptors are shown in Table 1 for all four
compound libraries, while the percentage compliances for
14 selected ADMET-related descriptors are shown in
Table 2. The mean values and percentage compliances indicate
a high probability of finding drug leads within the
CamMedNP compound library.

Figure 1 Distribution curves for #stars within the CamMedNP
library and subsets. Blue = CamMedNP library, red = drug-like
subset, green = lead-like subset and violet = fragment-like subset.
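The subset criteria quoted above can be written as simple filters. The following Python sketch assumes the inequalities exactly as printed in this section (all strict); the function name `in_subsets` and its argument names are hypothetical and are not part of any software used in the study.

```python
def in_subsets(mw, log_p, hbd, hba, nrb):
    """Return membership flags for the three standard subsets.

    Thresholds follow the criteria quoted in the text:
    drug-like [14], lead-like [53-55] and fragment-like [56].
    """
    drug_like = mw < 500 and log_p < 5 and hbd < 5 and hba < 10
    lead_like = 150 < mw < 350 and log_p < 4 and hbd < 3 and hba < 6
    fragment_like = (
        mw < 250 and -2 < log_p < 3 and hbd < 3 and hba < 6 and nrb < 3
    )
    return {
        "drug_like": drug_like,
        "lead_like": lead_like,
        "fragment_like": fragment_like,
    }
```

For example, a hypothetical compound with the mean properties of the total library in Table 1 (MW = 426.70, log P = 4.18, HBD = 2.39, HBA = 5.85, NRB = 5.31) would pass the drug-like filter but fail the lead-like and fragment-like filters on molecular weight.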

Bioavailability prediction 

The bioavailability of a compound depends on the pro- 
cesses of absorption and liver first-pass metabolism [57]. 
The absorption, in turn, depends on the solubility and 
permeability of the compound, as well as on the interac- 
tions with transporters and metabolizing enzymes in the 
gut wall. The computed parameters used to assess oral absorption
are the predicted aqueous solubility, log S_wat, the
conformation-independent predicted aqueous solubility,
CI log S_wat, the predicted qualitative human oral absorption,
the predicted % human oral absorption and compliance
to Jorgensen's 'Rule of Three' (ro3). The
solubility calculation procedure implemented depends on
the similarity in property space between the given molecule
and its most similar analogue within the experimental
training set used to develop the model implemented in
QikProp, i.e. if the similarity is <0.9, then the QikProp
predicted value is taken; otherwise, the predicted property,
P_pred, is adjusted such that

$$P_{pred} = S \, P_{exp} + (1 - S) \, P_{QP} \qquad (1)$$

where S is the similarity and P_exp and P_QP are, respectively,
the experimental value and the QikProp prediction for the most
similar molecule within the training set. In Equation 1, if
S = 1, then the predicted property is equal to the measured
experimental property of the training set compound.
According to Jorgensen's ro3, if a compound
complies to all or some of the rules (log S_wat > -5.7,
BIP_Caco-2 > 22 nm/s and number of primary metabolites <
seven), then it is more likely to be orally available. The
distribution curves for two of the three determinants for the
ro3 (log S_wat and BIP_Caco-2) are shown in Figure 2A,B. In
general, 47.22% of the CamMedNP library was compliant
Table 1 Average pharmacokinetic property distributions of total CamMedNP library in comparison with various
subsets

                                                 Subset
Library name                  CamMedNP     Drug-like    Lead-like    Fragment-like
Lib. size^a                   1,859        1,122        520          81
No. compl.^b                  881          807          422          45
MW (Da)^c                     426.70       330.33       276.46       195.75
Log P^d                       4.18         2.82         2.24         1.42
HBA^e                         5.85         5.18         4.39         3.53
HBD^f                         2.39         1.40         1.29         0.81
NRB^g                         5.31         4.51         3.39         2.06
Log B/B^h                     -1.30        -0.77        -0.64        -0.36
BIP_Caco-2 (nm/s)^i           1,199.37     1,216.27     1,207.91     1,577.16
S_mol (Å²)^j                  696.28       569.69       501.44       393.20
S_mol,hfob (Å²)^k             409.24       280.66       200.91       131.61
V_mol (Å³)^l                  1,304.41     1,024.64     870.65       645.16
Log S_wat (S in mol/L)^m      -5.11        -3.87        -3.13        -1.77
Log K_HSA^n                   0.46         0.15         -0.05        -0.44
MDCK^o                        661.25       671.02       663.83       907.31
Ind_coh^p                     0.013        0.009        0.009        0.006
Glob^q                        0.84         0.86         0.88         0.92
QP_polrz (Å³)^r               42.47        33.56        28.23        19.86
LogHERG^s                     -4.64        -4.41        -4.22        -3.40
Log K_p^t                     -2.96        -2.86        -2.89        -2.63
#metab^u                      5.56         4.62         3.57         2.07

a Size or number of compounds in library; b number of compounds with #stars = 0; c molar weight (range for 95% of
drugs is 130 to 725 Da); d logarithm of partitioning coefficient between n-octanol and water phases (range for 95% of
drugs is -2 to 6); e number of hydrogen bonds accepted by the molecule (range for 95% of drugs is 2 to 20); f number
of hydrogen bonds donated by the molecule (range for 95% of drugs is 0 to 6); g number of rotatable bonds (range for
95% of drugs is 0 to 15); h logarithm of predicted blood/brain barrier partition coefficient (range for 95% of drugs is
-3.0 to 1.0); i predicted apparent Caco-2 cell membrane permeability in Boehringer-Ingelheim scale, in nm/s (range for
95% of drugs is <5 low, >100 high); j total solvent-accessible molecular surface, in Å² (probe radius 1.4 Å; range for
95% of drugs is 300 to 1,000 Å²); k hydrophobic portion of the solvent-accessible molecular surface, in Å² (probe
radius 1.4 Å; range for 95% of drugs is 0 to 750 Å²); l total volume of molecule enclosed by solvent-accessible
molecular surface, in Å³ (probe radius 1.4 Å; range for 95% of drugs is 500 to 2,000 Å³); m logarithm of aqueous
solubility, S in mol/dm³ (range for 95% of drugs is -6.0 to 0.5); n logarithm of predicted binding constant to human
serum albumin (range for 95% of drugs is -1.5 to 1.2); o predicted apparent MDCK cell permeability in nm/s (<25 poor,
>500 great); p index of cohesion interaction in solids (0.0 to 0.05 for 95% of drugs); q globularity descriptor (0.75
to 0.95 for 95% of drugs); r predicted polarizability (13.0 to 70.0 for 95% of drugs); s predicted IC50 value for
blockage of HERG K+ channels (concern <-5); t predicted skin permeability (-8.0 to -1.0 for 95% of drugs); u number
of likely metabolic reactions (range for 95% of drugs is 0 to 15).



to the ro3, while the respective percentage compliances
for the various subsets were 72.28%, 92.11% and 100% for
the drug-like, lead-like and fragment-like libraries. Among
the individual computed parameters, the most remarkable
was the log S_wat criterion, which was met by 75.74% of the compounds
within the CamMedNP library, while this property shows
a Gaussian distribution for the drug-like and lead-like
subsets. Only 37.94% of the compounds fell within the
recommended range for the BIP_Caco-2 criterion. The predicted
apparent Caco-2 cell permeability, BIP_Caco-2 (in nm/s),
models the permeability of the gut-blood barrier (for non-active
transport), even though this parameter is not often
correctly predicted computationally [58]. The histograms
of the predicted qualitative human oral absorption parameter
(on the scale 1 = low, 2 = medium and 3 = high) are
shown in Figure 3. It was observed that 52.45% of the
compounds in CamMedNP were predicted to have high
human oral absorption. The predicted % human oral absorption
(on a 0 to 100% scale) shows a similar trend, with
41.06% of the compounds being predicted to be absorbed
at 100%, and 57.96% of the compounds predicted to be
absorbed at >90%.
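The similarity-based adjustment of Equation 1 and the ro3 check described in this section can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the QikProp implementation: the function and argument names are hypothetical, the 0.9 similarity threshold and the three ro3 cut-offs are those quoted in the text, and counting the number of rules met (rather than returning a single yes/no answer) is a choice made here because the text speaks of complying with "all or some" of the rules.

```python
def adjusted_prediction(similarity, p_query_qp, p_exp_analogue, p_qp_analogue,
                        threshold=0.9):
    """Apply Equation 1 when the nearest training-set analogue is similar enough.

    similarity      -- S, similarity to the most similar training-set molecule
    p_query_qp      -- unadjusted prediction for the query molecule
    p_exp_analogue  -- P_exp, experimental value for the analogue
    p_qp_analogue   -- P_QP, prediction for the analogue
    """
    if similarity < threshold:
        # Below the threshold the unadjusted prediction is taken.
        return p_query_qp
    return similarity * p_exp_analogue + (1.0 - similarity) * p_qp_analogue


def ro3_rules_met(log_s_wat, bip_caco2, n_primary_metabolites):
    """Count how many of the three ro3 criteria a compound satisfies."""
    rules = (
        log_s_wat > -5.7,            # aqueous solubility criterion
        bip_caco2 > 22.0,            # Caco-2 permeability criterion, nm/s
        n_primary_metabolites < 7,   # primary metabolites criterion
    )
    return sum(rules)
```

With S = 1 the first function returns the experimental value of the analogue, as noted for Equation 1; with S below 0.9 it returns the unadjusted prediction.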

The size of a molecule, as well as its capacity to make 
hydrogen bonds, its overall lipophilicity, its shape and 
flexibility are important properties to consider when de- 
termining permeability. Molecular flexibility has been seen 
as a parameter which is dependent on the number of ro- 
tatable bonds (NRB), a property which influences the bio- 
availability in rats [58]. The distribution of the NRB for 
this dataset has been previously discussed [29] and re- 
vealed that the compounds within the CamMedNP library 
show some degree of conformational flexibility, the peak 
Table 2 Percentage compliances of selected ADMET-related descriptors of total CamMedNP library in comparison with
various subsets

                                                 Subset
Library name                  Total library   Drug-like    Lead-like    Fragment-like
Log B/B                       88.22           99.55        100.00       100.00
BIP_Caco-2 (nm/s)             43.95           41.80        39.04        25.93
S_mol (Å²)                    89.69           99.55        100.00       95.06
S_mol,hfob (Å²)               90.89           100.00       100.00       100.00
V_mol (Å³)                    90.95           99.47        99.81        95.06
Log S_wat (S in mol/L)        69.08           89.57        100.00       97.53
Log K_HSA                     85.77           99.82        100.00       100.00
MDCK                          49.94           58.02        56.73        49.38
Ind_coh                       95.20           98.75        99.62        100.00
Glob                          87.90           96.97        96.73        83.95
ro3^a                         47.22           72.28        91.92        100.00
LogHERG                       55.02           61.94        73.27        100.00
Log K_p                       91.44           95.99        97.50        97.53
#metab                        79.61           89.30        97.31        93.83

The descriptors of the entries in the first column are defined in Table 1; a percentage compliance to Jorgensen's
Rule of Three.



value for the NRB being between 1 and 2, while the average
value is 5.31 (Table 1).

Prediction of blood-brain barrier penetration 

Drugs that are too polar do not cross the blood-brain barrier (BBB). The blood/brain
partition coefficients (log B/B) were computed and used as
a predictor for access to the central nervous system (CNS).
The predicted CNS activity was computed on a -2 (inactive)
to +2 (active) scale and showed that only 1.85% of
the compounds in the CamMedNP could be active in the
CNS (predicted CNS activity >1). A distribution of the log
B/B (Figure 4) shows a right-slanted Gaussian-shaped curve
with a peak at -0.5 log B/B units (the same for all the
standard subsets), with >88% of the compounds in the
CamMedNP falling within the recommended range for the
predicted brain/blood partition coefficient (-3.0 to 1.2). The
MDCK monolayers are widely used to make oral absorp- 
tion estimates, the reason being that these cells also express 
transporter proteins, but only express very low levels of me- 
tabolizing enzymes [58]. They are also used as an additional 
criterion to predict BBB penetration. Thus, our calculated 
apparent MDCK cell permeability could be considered to 
be a good mimic for the BBB (for non-active transport). It 
was estimated that only about 50% of the compounds had 
apparent MDCK cell permeabilities falling within the 
recommended range of 25 to 500 nm/s for 95% of known
drugs. This situation was not greatly improved in the drug- 
like and lead-like subsets (58% and 57%, respectively). 
Figure 2 Distribution curves for compliance to Jorgensen's 'Rule of Three'. (A) Calculated log S_wat against count.
(B) Predicted BIP_Caco-2 against count. Blue = CamMedNP library, red = drug-like subset, green = lead-like subset
and violet = fragment-like subset.

Figure 3 Histograms showing the distribution of human oral
absorption predictions. The x-axis groups the predictions by library
(CamMedNP, drug-like, lead-like and fragment-like).



Prediction of dermal penetration 

This factor is important for drugs administered through
the skin. The distribution of the computed skin permeability
parameter, log K_p, showed smooth Gaussian-shaped graphs
centred at -2.5 log K_p units for all four datasets
(Figure 5), with approximately 91% of the compounds in
the CamMedNP database falling within the recommended
range for >95% of known drugs. The predicted maximum
transdermal transport rates, J_m (in µg cm^-2 h^-1), were
computed from the aqueous solubility (S_wat), the MW and the skin
permeability (K_p) using the relation (2):

$$J_m = K_p \times MW \times S_{wat} \qquad (2)$$

This parameter showed variations from 0 to 1,603 µg cm^-2 h^-1,
with only about 1.39% of the compounds in
CamMedNP having a predicted value of J_m > 100 µg cm^-2 h^-1.
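Relation (2) can be evaluated directly from the logarithmic descriptors. The sketch below is a minimal Python illustration that assumes log K_p and log S_wat are base-10 logarithms and implements only the bare product stated in the text; any unit-conversion factor needed to reach µg cm^-2 h^-1 is not given in this section and is therefore not applied. The function name `max_transdermal_rate` is hypothetical.

```python
def max_transdermal_rate(log_kp, mw, log_s_wat):
    """Evaluate J_m = K_p * MW * S_wat from log10-scaled inputs.

    Assumption: both logarithms are base 10. No unit conversion is
    applied, so the result is only the bare product of relation (2).
    """
    k_p = 10.0 ** log_kp        # skin permeability
    s_wat = 10.0 ** log_s_wat   # aqueous solubility
    return k_p * mw * s_wat
```

Because J_m scales with the product of K_p and S_wat, a compound that is both poorly soluble and poorly permeable is penalised twice, which is consistent with the small fraction of compounds reaching high predicted transport rates.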
Figure 4 Plot of the physico-chemical descriptor used to
predict BBB penetration. Predicted log B/B against count. The x-axis
label is the lower limit of the binned data, e.g. 0 is equivalent to 0.0 to
1.0. Blue = CamMedNP library, red = drug-like subset, green = lead-like
subset and violet = fragment-like subset.

Figure 5 Distribution curves for the predicted skin penetration
parameter. Blue = CamMedNP library, red = drug-like subset,
green = lead-like subset and violet = fragment-like subset.



Prediction of plasma-protein binding 

The efficiency of a drug may be affected by the degree to 
which it binds to the proteins within the blood plasma. It is 
noteworthy that the binding of drugs to the plasma proteins
(like human serum albumin, lipoprotein, glycoprotein,
α, β and γ globulins) greatly reduces the quantity of
the drug in the general blood circulation, and hence, the
less bound a drug is, the more efficiently it can traverse cell
membranes or diffuse. The predicted plasma-protein binding
has been estimated by the prediction of binding to
human serum albumin; the recommended range of the
log K_HSA parameter is -1.5 to 1.5 for 95% of known drugs.
Figure 6 shows the variation of this calculated parameter
within the CamMedNP dataset, as well as for the standard
subsets. This equally gave smooth Gaussian-shaped curves
centred on -0.5 log K_HSA units for all four datasets. In
addition, our calculations revealed that >85% of the com- 
pounds within the CamMedNP library are compliant to 
this parameter, indicating that a majority of the compounds 
are likely to circulate freely within the blood stream and, 
hence, have access to the target site. 

Figure 6 Distribution curves for predicted plasma-protein
binding. Blue = CamMedNP library, red = drug-like subset, green =
lead-like subset and violet = fragment-like subset.

Metabolism prediction

An estimated number of possible metabolic reactions has
also been predicted by QikProp and used to determine
whether the molecules can easily gain access to the target
site after entering the blood stream. The average estimated
number of possible metabolic reactions for the CamMedNP
library was between five and six, while the averages for the
standard subsets drop by roughly one reaction at each step from
the drug-like to the lead-like to the fragment-like subset
(Table 1). Even though some of the compounds are
likely to undergo as many as 26 metabolic reactions
due to the complexity of some of the plant secondary
metabolites within the database (Figure 7), about 80% of the
compounds are predicted to undergo the recommended
number of metabolic steps (one to eight reactions), with
the situation improving to around 90% and approximately
97% in the drug-like and lead-like subsets, respectively.
From Figure 7, it can be observed that, except for the
fragment-like subset, which peaks at two predicted metabolic
reactions, the peak values for the number of predicted
metabolic reactions were at three for all of the datasets.

Prediction of blockage of human ether-a-go-go-related
gene potassium channel

Human ether-a-go-go-related gene (HERG) encodes a
potassium ion (K+) channel that is implicated in the fatal
arrhythmia known as torsade de pointes or the long QT
syndrome [59]. The HERG K+ channel, which is best
known for its contribution to the electrical activity of the
heart which coordinates the heart's beating, appears to be
the molecular target responsible for the cardiac toxicity of
a wide range of therapeutic drugs [60]. HERG has also
been associated with modulating the functions of some
cells of the nervous system and with establishing and
maintaining cancer-like features in leukemic cells [61].
Thus, HERG K+ channel blockers are potentially toxic, and
the predicted IC50 values often provide reasonable
predictions for cardiac toxicity of drugs in the early stages
of drug discovery [62]. In this work, the estimated or
predicted IC50 values for blockage of this channel have been
used to model the process in silico. The recommended
range for the predicted log IC50 values for blockage of the
HERG K+ channels (logHERG) is >-5. A distribution curve
for the variation of the predicted logHERG is shown in
Figure 8, which is a left-slanted Gaussian-shaped curve with
a peak at -5.5 logHERG units for both the total library and
the drug-like subset, whereas the lead-like subset
peaks at -4.5 units. It was observed that, in general, this
parameter is predicted to fall within the recommended
range for about 55% of the compounds within the
CamMedNP database, approximately 62% for the drug-like
subset and around 73% for the lead-like subset.



Usefulness of the CamMedNP library 

The usefulness of the CamMedNP database in lead ge- 
neration has been exemplified with the docking and 
pharmacophore-based screening for potential inhibitors of 
a validated anti-malarial drug target in our laboratory, and 
the results will be published in a subsequent paper. It is 
important to mention that virtual screening results could 
provide insight and direct natural products chemists to 
search for theoretically active principles with attractive 
ADMET profiles, which have been previously isolated, but 
not tested for activity against specified drug targets (if samples
are absent). This 'resurrection' process could prove to
be a better procedure for lead search than the random 
screening, which is a common practice in our Cameroon- 
ian laboratories. CamMedNP is constantly being updated; 
meanwhile, a MySQL platform (Cupertino, USA) to facili- 
tate the searching of this database and ordering of com- 
pound samples is under development within our group and 
will also be published subsequently. However, 3D structures
of the compounds, as well as their physico-chemical properties
that were used to evaluate the DMPK profile, can be
freely downloaded as additional files accompanying this
publication (see Additional file 1, Additional file 2,
Additional file 3, Additional file 4). In addition, information
about compound sample availability can be obtained
on request from the authors of this paper or from the
pan-African Natural Products Library (p-ANAPL) project
[63,64].

Figure 7 Distribution of the predicted number of metabolic
reactions for compounds in the CamMedNP. Blue = CamMedNP
library, red = drug-like subset, green = lead-like subset and violet =
fragment-like subset.

Figure 8 A plot of the predicted logHERG values for the CamMedNP
and standard subsets. Blue = CamMedNP library, red = drug-like
subset, green = lead-like subset and violet = fragment-like subset.

Conclusion 

Modern drug discovery programs usually involve the 
search for small molecule leads with attractive phar- 
macokinetic profiles. The presence of such within the 
CamMedNP library is of major importance and, therefore, 
renders the database attractive, in addition to the already-
known properties (drug-like, lead-like, fragment-like and
diverse). This is an indication that the 3D structures of nat- 
urally occurring compounds within the CamMedNP could 
be a good starting point for docking, neural networking 
and pharmacophore-based virtual screening campaigns, 
thus rendering the CamMedNP as a useful asset for the 
drug discovery community. 3D structures of the com- 
pounds, as well as their physico-chemical properties that 
were used to evaluate the DMPK profile of the 
CamMedNP library, can be freely downloaded (for non- 
commercial use) as additional files which accompany this 
publication (see Additional file 1, Additional file 2, 
Additional file 3, Additional file 4). The physical samples 
for testing are available at the various research laboratories 
in Cameroon in varying quantities. Questions regarding 
the availability of the compound samples could be 
addressed directly to the authors of this paper. Otherwise, 
the samples could be obtainable from the p-ANAPL con- 
sortium, which has a mandate to collect samples of NPs 
from the entire continent of Africa and make them avail- 
able for biological screening. This network is being set up 
under the auspices of the Network for Analytical and Bio- 
assay Services in Africa [63,64]. 

Additional files 